Crowdsourced on-boarding of digital assistant operations

ABSTRACT

Embodiments described herein are generally directed towards systems and methods relating to a crowd-sourced digital assistant and system. In particular, embodiments facilitate the intuitive creation and distribution of action datasets that include computing events or tasks that can be reproduced when an associated command, stored in an action dataset, is determined to have been received by a digital assistant device. The digital assistant device described herein can generate new action datasets, on-board new action datasets, and receive new action datasets or updates to existing action datasets. Each digital assistant device in the described system can participate in the building of action datasets, so as to crowd-source a dialect that can be understood by a digital assistant device.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 15/984,122, filed May 18, 2018, now U.S. Pat. No. 11,520,610, which claims the benefit of U.S. Provisional Patent Application No. 62/508,181, filed May 18, 2017, and entitled SYSTEMS AND METHODS FOR CROWDSOURCED ACTIONS AND COMMANDS, which are assigned or under obligation of assignment to the same entity as this application, the entire contents of the applications being herein incorporated by reference.

BACKGROUND

Digital assistants have become ubiquitous in a variety of consumer electronic devices. Modern day digital assistants employ speech recognition technologies to provide a conversational interface between users and electronic devices. These digital assistants can employ various algorithms, such as natural language processing, to improve interpretations of commands received from a user. Consumers have expressed various frustrations with conventional digital assistants due to privacy concerns, constant misinterpretations of spoken commands, unavailability of services due to weak signals or a lack of signal, and the general requirement that the consumer must structure their spoken command in a dialect that is uncomfortable for them.

SUMMARY

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used in isolation as an aid in determining the scope of the claimed subject matter.

Embodiments described in the present disclosure are generally directed towards systems and methods relating to a crowd-sourced digital assistant for computing devices. In particular, embodiments facilitate the intuitive creation and distribution of action datasets that include reproducible computing events and associated commands that can be employed to invoke a reproduction of the computing events on computing devices having the crowd-sourced digital assistant installed and/or executing thereon. The described embodiments describe a digital assistant system and application that can perform any operation on a computing device by way of a received command, the operations being limited only by the various operations executable on the computing device.

In accordance with embodiments described herein, the described digital assistant and corresponding system provide an ever-growing and evolving library of dialects that enables the digital assistant to learn from its users, in contrast to the frustrating and limited interpretation features provided by conventional digital assistants. Further, because the digital assistant and corresponding system are configured with a framework for distributing improvements to its collection of actionable operations and understandable commands, and because the digital assistant utilizes applications existing on the computing device of each user, privacy concerns typically associated with conventional digital assistants are significantly reduced.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is described in detail below with reference to the attached drawing figures, wherein:

FIG. 1 is a block diagram of an exemplary computing environment for a crowd-sourced digital assistant system, in accordance with embodiments of the present invention;

FIG. 2 is a block diagram of an exemplary digital assistant device, in accordance with an embodiment of the present disclosure;

FIG. 3 is a block diagram of an exemplary digital assistant server, in accordance with an embodiment of the present disclosure;

FIG. 4 is an exemplary data structure of an action dataset, in accordance with an embodiment of the present disclosure;

FIG. 5 is a flow diagram showing a method for generating an actionable operation for a crowd-sourced digital assistant network, according to various embodiments of the present invention;

FIG. 6 is a flow diagram showing a method for on-boarding an actionable operation for a crowd-sourced digital assistant network, according to various embodiments of the present invention;

FIG. 7 is a flow diagram showing a method for reproducing an actionable operation that was obtained from a crowd-sourced digital assistant network, according to various embodiments of the present invention; and

FIG. 8 is a block diagram of an exemplary computing environment suitable for use in implementing embodiments of the present invention.

DETAILED DESCRIPTION

The subject matter of the present invention is described with specificity herein to meet statutory requirements. However, the description itself is not intended to limit the scope of this patent. Rather, the inventors have contemplated that the claimed subject matter might also be embodied in other ways, to include different steps or combinations of steps similar to the ones described in this document, in conjunction with other present or future technologies. Moreover, although the terms “step” and/or “block” may be used herein to connote different elements of methods employed, the terms should not be interpreted as implying any particular order among or between various steps herein disclosed unless and except when the order of individual steps is explicitly described.

Aspects of the technology described herein are generally directed towards systems and methods for crowdsourcing actionable operations of a digital assistant. The described embodiments facilitate the creation, on-boarding, and distribution of action datasets to any number of computing devices having an instance of the digital assistant installed and/or executing thereon (hereinafter referenced as a “digital assistant device”). In accordance with the present disclosure, an “operation” can correspond to a final result, output, or computing operation that is generated, executed, or performed by a digital assistant device based on one or more action datasets selected and interpreted for execution by the digital assistant device, each action dataset being comprised of one or more reproducible computing events that can be invoked in response to a received command determined to correspond to the action dataset. In accordance with embodiments described herein, an “action” is described in reference to an operation that is performed in response to an action dataset selected and interpreted for execution. In this regard, an action can be performed, invoked, initiated, or executed, among other things, and any reference to the foregoing can imply that a corresponding action dataset is selected and interpreted for execution by the digital assistant device to perform the corresponding operation.

In some embodiments, actions (or the action datasets corresponding thereto) can be created, on the computing device via the digital assistant device, by recording a series of detected events (e.g., inputs) that are typically provided by a user of the computing device when manually invoking the desired operation. That is, to create a new action dataset, the digital assistant device can invoke a recording mode where a user can simply perform a series of computing operations (e.g., manual touches, click inputs) within one or more applications to achieve a desired result or operation. After the recording is stopped by the user, via a terminating input, the action dataset can store and be associated with a set of command templates corresponding to commands that the user would preferably announce to the digital assistant device when an invocation of the operation is desired. In various embodiments, a command representation can be received as speech data and converted to text (e.g., by a speech engine of the digital assistant device), or received as text input data. In accordance with embodiments described herein, a “command” is referenced herein to describe data, received as speech data or as text data. A “command representation,” on the other hand, is referenced to describe text data that is received, based on inputs (e.g., keyboard), received speech data converted to text data, or received text data communicated from another computing device. A “command template” is referenced herein to describe a portion of a command representation having defined parameter fields in place of variable terms.

In more detail, one or more terms or keywords in the received command can be defined as a parameter based on input(s) received from the user. A parameter, in accordance with the present disclosure, can be referenced as corresponding to one of a plurality of predefined parameter types, such as but not limited to, genre, artist, title, location, name or contact, phone number, address, city, state, country, day, week, month, year, and more. It is also contemplated that the digital assistant device can access from a memory, or retrieve (e.g., from a server), a set of predefined parameter types that are known or determined to correspond to the application for which an action dataset is being created. In some embodiments, the set of predefined parameter types can be determined based at least in part on corresponding application identifying information. The digital assistant device can extract, based on the defined parameters, the corresponding keywords and generate a command template based on the remaining terms and the defined parameters. By way of example only, if the command was originally received as “play music by Coldplay,” and the term “Coldplay” is defined as a parameter of type “artist,” a resulting command template generated by the digital assistant device may appear as “play music by <artist>”. In this regard, a command template may include the originally received command terms if no parameters are defined, or may include a portion of the originally received command terms with parameter fields defined therein, the defined parameters corresponding to variable terms of a command.
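By way of illustration only, the template generation described above might be sketched as follows. The function name `make_command_template` and the dictionary mapping each user-marked term to its parameter type are hypothetical and are not part of the disclosure; whole-word, case-insensitive replacement is likewise an assumption.

```python
import re


def make_command_template(command: str, parameters: dict[str, str]) -> str:
    """Replace each user-defined variable term with a <type> parameter field.

    `parameters` maps a term of the received command (e.g., "Coldplay")
    to a parameter type (e.g., "artist"). Hypothetical helper.
    """
    template = command
    for term, param_type in parameters.items():
        # Whole-word, case-insensitive match of the variable term.
        pattern = r"\b" + re.escape(term) + r"\b"
        # A callable replacement avoids any escape processing of the field.
        template = re.sub(
            pattern,
            lambda _m, t=param_type: f"<{t}>",
            template,
            flags=re.IGNORECASE,
        )
    return template


print(make_command_template("play music by Coldplay", {"Coldplay": "artist"}))
# play music by <artist>
```

When no parameters are defined, the sketch returns the originally received command terms unchanged, consistent with the description above.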

The digital assistant device can receive, among other things, application identifying information, a recorded series of events, and a set of command templates to generate a new action dataset that can be retrieved, interpreted and/or invoked by the digital assistant device, simply based on a determination, by the digital assistant device, that a received command is associated with the action dataset. When an action is invoked based on a determination that a received command corresponds to an action dataset, the digital assistant device can reproduce (e.g., emulate, invoke, execute, perform) the recorded series of events associated with the corresponding action dataset, thereby performing the desired operation. Moreover, in circumstances where a received command includes a parameter term, and a determination is made that the received command corresponds to an action dataset having a parameter field that also corresponds to the parameter term, the parameter term can be employed, by the digital assistant device, to perform custom operations while performing the action. For instance, the digital assistant device can input the parameter term as a text input into a field of the application. In some embodiments, the action dataset may include a different set of events.
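As a minimal sketch of how a parameter term could be employed while reproducing recorded events, assume each recorded event is a dictionary whose optional `value` entry may contain a parameter field such as `<artist>`. This event layout, the field names, and the function `bind_parameters` are assumptions made for illustration; the disclosure does not define an event format.

```python
def bind_parameters(events: list[dict], params: dict[str, str]) -> list[dict]:
    """Return a copy of the recorded events with parameter fields filled in.

    Hypothetical event layout: {"type": ..., "target": ..., "value": ...}.
    """
    bound = []
    for event in events:
        event = dict(event)  # copy so the stored action dataset is untouched
        value = event.get("value")
        if isinstance(value, str):
            for name, term in params.items():
                value = value.replace(f"<{name}>", term)
            event["value"] = value
        bound.append(event)
    return bound


recorded = [
    {"type": "tap", "target": "search_button"},
    {"type": "text_input", "target": "search_field", "value": "<artist>"},
    {"type": "tap", "target": "play_button"},
]
for event in bind_parameters(recorded, {"artist": "Coldplay"}):
    print(event)
```

In this sketch the bound events would then be handed to whatever mechanism emulates inputs on the device, which is outside the scope of the example.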

In some further embodiments, an action dataset, once created by the digital assistant, can be uploaded (hereinafter also referenced as “on-boarded”) to a remote server for storage thereby. The action dataset can be on-boarded automatically upon its generation or on-boarded manually based on a received instruction, by the digital assistant device. It is contemplated that individuals may want to keep their actions or command templates private, and so an option to keep an action dataset limited to local storage may be provided to the user (e.g., via a GUI element). The server, upon receiving an on-boarded action dataset, can analyze the action dataset and generate an associated action signature based on the characteristics and/or contents of the action dataset. Contents of an action dataset can include, among other things, application identifying information, corresponding command templates and parameters, and a recorded series of events. The action signature can be generated by various operations, such as hashing the on-boarded action dataset with a hashing algorithm, by way of example. It is also contemplated that the action signature can be generated by the on-boarding digital assistant device, the generated action signature then being stored in or appended to the action dataset before it is uploaded to the server.
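One possible way to generate an action signature by hashing is sketched below. The choice of SHA-256 over a canonical JSON serialization, and the dataset field names, are assumptions for illustration only; the disclosure does not specify a hashing algorithm or a serialization.

```python
import hashlib
import json


def action_signature(action_dataset: dict) -> str:
    """Hash an action dataset into a hex signature (illustrative only)."""
    # Sorting keys and fixing separators makes the serialization canonical,
    # so logically equal datasets produce the same signature.
    canonical = json.dumps(action_dataset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


dataset = {
    "app_id": "com.example.music",  # hypothetical application identifier
    "command_templates": ["play music by <artist>"],
    "events": [{"type": "text_input", "target": "search_field", "value": "<artist>"}],
}
print(action_signature(dataset))
```

Because the serialization is canonical, the same signature results whether it is computed on the server or on the on-boarding digital assistant device.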

In one aspect, the server can determine that the on-boarded action dataset already exists on the server, based on a determination that the action signature corresponds to the action signature of another action dataset already stored on the server. The server can either dispose of the on-boarded action dataset or merge the on-boarded action dataset (or determined differing portion(s) thereof) with an existing action dataset stored thereby, preventing redundancy and saving storage space. In another aspect, the server can analyze the on-boarded action dataset to determine if its contents (e.g., the recorded events, command templates, metadata) comply with one or more defined policies (e.g., policies against inappropriate language, misdirected operations, or incomplete actions) associated with general usage of the digital assistant system. In another aspect, the server can employ machine learning algorithms, among other things, to perform a variety of tasks, such as determining relevant parameter types, generating additional command templates for association with an on-boarded or stored action dataset, comparing similarity of events between on-boarded action datasets to identify and select more efficient routes for invoking an operation, and more.
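The dispose-or-merge behavior can be sketched as below. The sketch assumes, purely for illustration, that the signature is computed over the application identifier and recorded events only, so that two datasets recording the same events with different command templates correspond to each other and their templates can be merged. The in-memory `store` dictionary, the field names, and both function names are hypothetical.

```python
import hashlib
import json


def events_signature(dataset: dict) -> str:
    """Signature over app id and events only (an assumption of this sketch)."""
    core = {"app_id": dataset["app_id"], "events": dataset["events"]}
    canonical = json.dumps(core, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def on_board(store: dict, dataset: dict) -> str:
    """Store a new dataset, merge new templates, or discard a duplicate."""
    signature = events_signature(dataset)
    existing = store.get(signature)
    if existing is None:
        store[signature] = {
            **dataset,
            "command_templates": list(dataset["command_templates"]),
        }
        return "stored"
    # Merge only the differing portion: templates not already known.
    new_templates = [
        t for t in dataset["command_templates"]
        if t not in existing["command_templates"]
    ]
    if not new_templates:
        return "discarded"
    existing["command_templates"].extend(new_templates)
    return "merged"
```

For example, on-boarding the same recorded events twice with the templates “check in” and then “please check in” would return `"stored"` and then `"merged"`, leaving one stored dataset with both templates.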

In some further embodiments, the server can distribute one or more stored action datasets to a plurality of digital assistant devices in communication with the server. In this way, each digital assistant device can receive action datasets or portions thereof (e.g., command templates) from the server. The action datasets can be distributed to the digital assistant devices in a variety of ways. For instance, in an embodiment, the server can freely distribute any or all determined relevant action datasets to digital assistant devices. In an embodiment, an application profile including a list of applications installed on a digital assistant device can be communicated to the server. Based on the application profile for the digital assistant device, the server can distribute any or all determined relevant action datasets to the digital assistant device. As digital assistant devices can include a variety of operating systems, and versions of applications installed thereon can also vary, it is contemplated that the application profile communicated by a digital assistant device to the server may include operating system and application version information, among other things, so that appropriate and relevant action datasets are identified by the server for distribution to the digital assistant device. For a more granular implementation, an action dataset profile including a list of action datasets or action signatures stored on the digital assistant device can be communicated to the server. In this way, only missing or updated action datasets can be distributed to the digital assistant device.
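A minimal sketch of profile-based selection follows, assuming each stored action dataset carries an `app_id` and a `signature` entry and that the device reports a set of installed application identifiers and a set of locally stored signatures. These names are illustrative assumptions, and operating system and version matching is omitted for brevity.

```python
def select_for_device(
    server_datasets: list[dict],
    installed_apps: set[str],
    device_signatures: set[str],
) -> list[dict]:
    """Pick datasets relevant to the device that it does not already hold.

    An updated dataset has a new signature, so it is selected as well.
    """
    return [
        dataset
        for dataset in server_datasets
        if dataset["app_id"] in installed_apps
        and dataset["signature"] not in device_signatures
    ]


server_datasets = [
    {"app_id": "com.example.music", "signature": "s1"},
    {"app_id": "com.example.music", "signature": "s2"},
    {"app_id": "com.example.airline", "signature": "s3"},
]
print(select_for_device(server_datasets, {"com.example.music"}, {"s1"}))
# [{'app_id': 'com.example.music', 'signature': 's2'}]
```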

In some embodiments, a user can simply announce a command to the digital assistant device, and if a corresponding action dataset is not stored on the digital assistant device, the digital assistant device can send the command (representation) to the server for determination and selection of a set of relevant action datasets, which can then be communicated to the digital assistant device. Provided that the digital assistant device has the corresponding application installed thereon, the digital assistant device can retrieve, from the server, a set of determined most relevant action datasets, without additional configuration or interaction by the user, also reducing server load and saving bandwidth by inhibiting extraneous transfer of irrelevant action datasets. A retrieved set of relevant action datasets can be received from the server for invocation by the digital assistant device. It is further contemplated that if two or more action datasets are determined equally relevant to a received command, each action dataset may be retrieved from the server, and the digital assistant device can provide or display a listing of the determined relevant action datasets for selection and execution.

In some further embodiments, a user of a digital assistant device can customize command templates associated with an action dataset corresponding to an application installed on their digital assistant device. Put simply, a user can employ the digital assistant (or a GUI thereof) to select an action dataset from a list of action datasets stored on the computing device, select an option to add a new command to the action dataset, and define a new command and any associated parameters for storage in the action dataset. In this regard, the user can add any custom command and parameter that can later be understood by the digital assistant device to invoke the action. In some aspects, the custom command and/or modified action can be on-boarded to the server for analysis and storage, as noted above. In some further aspects, based on the analysis, the server can distribute the custom command and/or at least a portion of the modified action dataset to a plurality of other digital assistant devices. In this regard, the list of understandable commands and corresponding actions can continue to grow and evolve, and be automatically provided to any other digital assistant device.

Accordingly, at a high level and with reference to FIG. 1, an example operating environment 100 in which some embodiments of the present disclosure may be employed is depicted. It should be understood that this and other arrangements and/or features described by the enclosed document are set forth only as examples. Other arrangements and elements (e.g., machines, interfaces, functions, orders, and groupings of functions, etc.) or features can be used in addition to or instead of those described, and some elements or features may be omitted altogether for the sake of clarity. Further, many of the elements or features described in the enclosed document may be implemented in one or more components, or as discrete or distributed components or in conjunction with other components, and in any suitable combination and location. Various functions described herein as being performed by one or more entities may be carried out by hardware, firmware, and/or software. For instance, some functions may be carried out by a processor executing instructions stored in memory.

The system in FIG. 1 includes one or more digital assistant devices 110, 115 a, 115 b, 115 c, . . . 115 n, in communication with a server 120 via a network 130 (e.g., the Internet). In this example, the server 120, also in communication with the network 130, is in communication with each of the digital assistant devices 110, 115 a-115 n, and can also be in communication with a database 140. The database 140 can be directly coupled to the server 120 or coupled to the server 120 via the network 130. The digital assistant device 110, representative of other digital assistant devices 115 a-115 n, is a computing device comprising one or more applications 112 and a digital assistant application 114 installed and/or executing thereon.

The one or more applications 112 include any application that is executable on the digital assistant device 110, and can include applications installed via an application marketplace, custom applications, web applications, side-loaded applications, applications included in the operating system of the digital assistant device 110, or any other application that can be reasonably considered to fit the general definition of an application or mobile application. On the other hand, the digital assistant application 114 can provide digital assistant services installed on the digital assistant device 110 or provided by the server 120 via the network 130, or can be implemented at least partially into an operating system of the digital assistant device 110. In accordance with embodiments described herein, the digital assistant application 114 provides an interface between a digital assistant device 110 and an associated user (not shown), generally via a speech-based exchange, although any other method of exchange between user and digital assistant device 110 (e.g., keyboard input, communication from another digital assistant device or computing device) remains within the purview of the present disclosure.

When voice commands are received by the digital assistant device 110, the digital assistant application 114 can convert the speech command to text utilizing a speech-to-text engine (not shown) to extract identified terms and generate a command representation. The digital assistant application 114 can receive the command representation, and determine that the command representation corresponds to at least one command template of at least one action dataset stored on the digital assistant device. In some embodiments, the digital assistant application can generate an index of all command templates stored on the digital assistant device 110 for faster searching and comparison of the received command representation to identify a corresponding command template, and thereby a corresponding action dataset. Each indexed command template can be mapped to a corresponding action dataset, which can be interpreted for execution in response to a determination of a confirmed match with the received command representation.
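The indexing and matching described above might be sketched as follows, assuming command templates use `<type>` parameter fields and are compared by compiling each template into a regular expression. The regular-expression approach, the linear scan over the index, and all function and field names are illustrative assumptions rather than the disclosed matching method.

```python
import re

FIELD = re.compile(r"(<[A-Za-z_]\w*>)")


def compile_template(template: str) -> re.Pattern:
    """Turn 'play music by <artist>' into a regex with a named group."""
    pattern = ""
    for part in FIELD.split(template):
        if FIELD.fullmatch(part):
            pattern += f"(?P<{part[1:-1]}>.+)"  # parameter field
        else:
            pattern += re.escape(part)  # literal keywords
    return re.compile(pattern, re.IGNORECASE)


def build_index(action_datasets: list[dict]) -> list[tuple[re.Pattern, dict]]:
    """Map every stored command template to its action dataset."""
    return [
        (compile_template(template), dataset)
        for dataset in action_datasets
        for template in dataset["command_templates"]
    ]


def match_command(index, command_representation: str):
    """Return (action dataset, extracted parameters), or None if no match."""
    text = command_representation.strip()
    for pattern, dataset in index:
        match = pattern.fullmatch(text)
        if match:
            return dataset, match.groupdict()
    return None


index = build_index([
    {"app_id": "com.example.music", "command_templates": ["play music by <artist>"]},
])
print(match_command(index, "play music by Coldplay"))
```

A returned value of `None` corresponds to the case where no locally stored action dataset matches the received command representation.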

By way of brief overview, a command template can include one or more keywords and/or one or more parameters that each have a corresponding parameter type. Each command template generally corresponds to an operation that can be performed on one or more applications 112 installed on a digital assistant device 110. Moreover, a plurality of command templates can correspond to a single operation, such that there are multiple equivalent commands that can invoke the same operation. By way of example only, commands such as “check in,” “check into flight,” “please check in,” “check into flight now,” “check in to flight 12345,” and the like, can all invoke the same operation that, by way of example only, directs the digital assistant application 114 to execute an appropriate airline application on the digital assistant device 110 and perform a predefined set of events or computer operations to achieve the same result.

The aforementioned commands, however, may lack appropriate information (e.g., the specific airline). As one of ordinary skill may appreciate, a user may have multiple applications 112 from various vendors (e.g., airlines) associated with a similar service (e.g., checking into flights). A digital assistant device 110 in accordance with embodiments described herein can provide features that can determine contextual information associated with the digital assistant device 110, or its associated user, based on historical use of the digital assistant device 110, profile information stored on the digital assistant device 110 or server 120, stored parameters from previous interactions or received commands, indexed messages (e.g., email, text messages) stored on the digital assistant device, and a variety of other types of data stored locally or remotely on a server, such as server 120, to identify a most relevant parameter and supplement a command to select a most relevant action dataset. More specific commands, such as “check into FriendlyAirline flight,” or “FriendlyAirline check in,” and the like, where a parameter is specifically defined in the command, can be recognized by the digital assistant application 114.

One or more recognizable commands and corresponding action datasets can be received by the digital assistant device 110 from the server 120 at any time, including upon installation, initialization, or invocation of the digital assistant application 114, after or upon receipt of a speech command by the digital assistant application 114, after or upon installation of a new application 112, periodically (e.g., once a day), or when pushed to the digital assistant device 110 from the server 120, among many other configurations. It is contemplated that the action datasets received by the digital assistant device 110 from the server 120 can be limited based at least in part on the applications 112 installed on the digital assistant device 110, although configurations where a larger or smaller set of action datasets is received are contemplated.

In the event an action dataset is determined not to be available for a particular application 112 installed on the digital assistant device 110, the digital assistant application 114 can either redirect the user to a marketplace (e.g., launch an app marketplace application) to install the appropriate application determined by the server 120 based on the received command, or can invoke an action training program that prompts a user to manually perform tasks on one or more applications to achieve the desired result, the tasks being recorded and stored into a new action dataset by the digital assistant device 110. The digital assistant application 114 can also receive one or more commands from the user (e.g., via speech or text) to associate with the action dataset being generated. If the command includes variable parameters (e.g., optional fields), the action training program can facilitate a definition of such parameters and corresponding parameter types to generate command templates for inclusion in the action dataset being generated. In this way, a command template(s) is associated with at least the particular application designated by the user and also corresponds to the one or more tasks manually performed by the user, associating the generated command template to the task(s) and thus the desired resulting operation.

In some instances, the server 120 can provide a determined most-relevant action dataset to the digital assistant device 110 based on the received command. The server 120 can store and index a constantly-growing and evolving plurality of crowd-sourced action datasets submitted by or received from digital assistant devices 115 a-115 n also independently having a digital assistant application 114 and any number of applications 112 installed thereon. The digital assistant devices 115 a-115 n may have any combination of applications 112 installed thereon, and any generation of action datasets performed on any digital assistant device 110, 115 a-115 n can be communicated to the server 120 to be stored and indexed for mass or selective deployment, among other things. In some aspects, the server 120 can include various machine-learned algorithms to provide a level of quality assurance on command templates included in on-boarded action datasets and/or the tasks and operations performed before they are distributed to other digital assistant devices via the network 130.

When the digital assistant application 114 determines an appropriate action dataset (e.g., one or more tasks to achieve a desired result) having one or more command templates that correspond to the received command, the digital assistant application 114 can generate an overlay interface that can mask any or all visual outputs associated with the determined action or the computing device generally. The generation of the overlay interface can include a selection, by the digital assistant application 114, of one or more user interface elements that are stored in a memory of the digital assistant device 110 or server 120, and/or include a dynamic generation of the user interface element(s) by the digital assistant application 114 or server 120 based on one or more portions of the received command and/or contextual data (e.g., determined location data, user profile associated with the digital assistant device 110 or digital assistant application 114, historical data associated with the user profile, etc.) obtained by the digital assistant device 110, digital assistant application 114, and/or server 120. The selected or generated one or more user interface elements can each include content that is relevant to one or more portions (e.g., terms, keywords) of the received command. In the event of dynamic generation of user interface elements, such elements can be saved locally on the digital assistant device 110 or remotely on the server 120 for subsequent retrieval by the digital assistant device 110, or can be discarded and dynamically regenerated at any time.

The example operating environment depicted in FIG. 1 is suitable for use in implementing embodiments of the invention. Generally, environment 100 is suitable for creating, on-boarding, storing, indexing, crowd-sourcing (e.g., distributing), and invoking actions or action datasets, among other things. Environment 100 includes an on-boarding digital assistant device 110, a receiving digital assistant device 120, action cloud server 130 and network 140. On-boarding digital assistant device 110 and receiving digital assistant device 120 can be any kind of computing device having a digital assistant application installed and/or executing thereon, the digital assistant application being implemented in accordance with at least some of the described embodiments. For example, in an embodiment, on-boarding digital assistant device 110 and receiving digital assistant device 120 can be a computing device such as computing device X00, as described below with reference to FIG. X. In embodiments, on-boarding digital assistant device 110 and receiving digital assistant device 120 can be a personal computer (PC), a laptop computer, a workstation, a mobile computing device, a PDA, a cell phone, a smart watch or wearable, or the like. Any digital assistant device described in accordance with the present disclosure can include features described with respect to both the on-boarding digital assistant device 110 and the receiving digital assistant device 120, and so any reference to a digital assistant device, generally, can include features of both 110, 120. In this regard, a digital assistant device can include one or more applications 112 installed and executable thereon. The one or more applications (not shown) include any application that is executable on the digital assistant device, and can include applications installed via an application marketplace, custom applications, web applications, side-loaded applications, applications included in the operating system of the digital assistant device, or any other application that can be reasonably considered to fit the general definition of an application. On the other hand, the digital assistant application can be an application, a service accessible via an application installed on the digital assistant device or via the network 140, or implemented into a layer of an operating system of the digital assistant device. In accordance with embodiments described herein, the digital assistant application can provide an interface between a digital assistant device and a user (not shown), generally via a speech-based exchange, although any other method of exchange between user and digital assistant device may be considered within the purview of the present disclosure.

Similarly, action cloud server 130 (“server”) can be any kind of computing device capable of facilitating the on-boarding, storage, management, and distribution of crowd-sourced action datasets. For example, in an embodiment, action cloud server 130 can be a computing device such as computing device X00, as described below with reference to FIG. X. In some embodiments, action cloud server 130 comprises one or more server computers, whether distributed or otherwise. Generally, any of the components of environment 100 may communicate with each other via a network 140, which may include, without limitation, one or more local area networks (LANs) and/or wide area networks (WANs). Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet. The server can include or be in communication with a data source (not shown), which may comprise data sources and/or data systems, configured to make data available to any of the various constituents of the operating environment. Data sources may be discrete from the illustrated components, or any combination thereof, or may be incorporated and/or integrated into at least one of those components.

Referring now to FIG. 2, a block diagram 200 of an exemplary digital assistant device 210 suitable for use in implementing embodiments of the invention is shown. Generally, digital assistant device 210 (also depicted as digital assistant device 110 of FIG. 1) is suitable for receiving commands, selecting action datasets to execute by matching received commands to command templates of action datasets, or determining that no action datasets correspond to received commands, interpreting a selected action dataset to execute the associated operation, generating new action datasets, and sending action datasets to or receiving action datasets from a digital assistant server. Digital assistant device 210 can include, among other things, a command receiving component 220, an action matching component 230, an action executing component 240, a training component 250, and a server interfacing component 260.

The command receiving component 220 can receive a command, either in the form of speech data or text data. The speech data can be received via a microphone of the digital assistant device 210, or another computing device paired to or in communication with the digital assistant device 210. The command receiving component 220, after receiving the speech data, can employ a speech-to-text engine of the digital assistant device 210 to generate a command representation (e.g., a text string of the command). Text data received by command receiving component 220, on the other hand, can be received via a virtual keyboard or other input method of the digital assistant device 210, and similarly, can be received from another computing device paired to or in communication with the digital assistant device 210. Received text data is already in the form of a command representation, and is treated as such. In various embodiments, command receiving component 220 can be invoked manually by a user (e.g., via an input to begin listening for or receiving the command), or can be in an always-listening mode.

Based on a command representation being received, action matching component 230 can determine whether one or more action datasets stored on the digital assistant device 210 include a command template that corresponds to or substantially corresponds (e.g., at least 90% similar) to the received command representation. In some aspects, a corresponding command template can be identified, and the action dataset in which the corresponding command template is stored is selected for interpretation by action executing component 240. In some other aspects, a corresponding command template cannot be identified, and either the training component 250 can be invoked, or the received command is communicated to the digital assistant server (depicted as server 130 of FIG. 1 and digital assistant server 310 of FIG. 3) via the server interfacing component 260.
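
By way of illustration only, and not limitation, the matching described above can be sketched as follows. The sketch assumes that similarity is measured with a character-level sequence ratio from the Python standard library; the disclosure does not specify a similarity measure, and the function names, the dictionary layout, and the threshold constant are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional

# Mirrors the "at least 90% similar" example; the metric itself is an assumption.
SIMILARITY_THRESHOLD = 0.90


def similarity(command: str, template: str) -> float:
    """Return a case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, command.lower(), template.lower()).ratio()


def match_action(command: str, action_datasets: list) -> Optional[dict]:
    """Return the stored action dataset whose command template best matches
    the command representation, or None if nothing meets the threshold."""
    best, best_score = None, 0.0
    for dataset in action_datasets:
        for template in dataset["command_templates"]:
            score = similarity(command, template)
            if score > best_score:
                best, best_score = dataset, score
    return best if best_score >= SIMILARITY_THRESHOLD else None
```

In such a sketch, a result of None would correspond to the case in which the training component 250 is invoked or the command is communicated to the digital assistant server.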

The action executing component 240 can receive a selected action dataset, either selected by digital assistant device 210 from local storage, by the digital assistant server from storage accessible thereto, or selected from a list presented by digital assistant device 210. The action executing component 240 can, from the received action dataset, interpret event data, which may include executable code, links, deep links, references to GUI elements, references to screen coordinates, field names, or other pieces of data that can correspond to one or more tasks or events stored in the selected action dataset. When the event data is interpreted, the action executing component 240 can reproduce the events that were recorded when the action dataset was initially generated, by any digital assistant device such as digital assistant device 210. In some aspects, the event data can include time delays, URLs, deep links to application operations, or any other operation that can be accessed, processed, emulated, or executed by the action executing component 240. In some aspects, events such as click or touch inputs can be reproduced on the digital assistant device 210, based on the interpreted event data stored in an action dataset.

The training component 250 can facilitate the generation of an action dataset, among other things. When the training component 250 is invoked, an indication, such as a GUI element, indicating that an action recording session has begun may be presented for display. A prompt to provide the tasks or events required to perform the desired operation can also be presented for display. In this regard, a user can begin by first launching an application with which the operation is associated, and proceed with providing inputs to the application (i.e., performing the requisite tasks). The inputs can be recorded by the digital assistant device 210, and the training component 250 can listen for, parse, identify, and record a variety of attributes of the received inputs, such as long or short presses, time delays between inputs, references to GUI elements interacted with, field identifiers, application links activated based on received inputs (e.g., deep links), and the like. The recorded inputs and attributes (e.g., event data) can be stored, sequentially, in an event sequence, and stored into a new action dataset. The application launched is also identified, and any application identifying information, such as operating system, operating system version, application version, paid or free version status, and more, can be determined from associated metadata and also stored into the new action dataset. When the desired operation is completed (i.e., all requisite tasks/events performed), a user can activate a training termination button, which can be presented as a floating button or other input mechanism that is preferably positioned away from an active portion of the display. Other termination methods are also contemplated, such as voice activated termination, or motion activated termination, without limitation.

The training component 250 can further request that the user provide a set of commands that correspond to the desired operation. A command can be received via speech data and converted to a command representation by a speech-to-text engine, or received via text input as a command representation, among other ways. When the set of commands is provided and stored as command representations, the training component 250 can further prompt the user to define any relevant parameters or variables in the command representations, which can correspond to keywords or values that may change whenever the command is spoken. In this regard, a user may select one or more terms included in the received command representations, and define them with a corresponding parameter type selected from a list of custom, predefined, or determined parameter types, as described herein. The training component 250 can then extract the selected one or more terms from a command representation defined as parameter(s), replacing them with parameter field identifier(s) of a corresponding parameter type, and store the resulting data as a command template. The training component 250 can then generate the action dataset from the recorded event sequence, the application identifying information, and the one or more defined command templates. In some embodiments, the training component 250 can generate an action signature or unique hash based on the generated action dataset or one or more portions of data included therein. The action signature can be employed by the digital assistant server to determine whether the action dataset or data included therein is redundant, among other things.
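
As a non-limiting sketch, the parameter extraction and action signature described above might look like the following. The brace-delimited parameter field identifier, the use of SHA-256 over a canonical JSON serialization, and all function names are assumptions made for illustration; the disclosure does not specify a template syntax or a hash function.

```python
import hashlib
import json


def make_command_template(command: str, parameters: dict) -> str:
    """Replace each user-selected term with a parameter field identifier.

    `parameters` maps a selected term to its parameter type, for example
    {"Thriller": "song"}. The "{type}" syntax is a hypothetical choice.
    """
    template = command
    for term, param_type in parameters.items():
        template = template.replace(term, "{" + param_type + "}")
    return template


def action_signature(action_dataset: dict) -> str:
    """Hash a canonical serialization so that logically equal datasets
    yield the same signature regardless of key order."""
    canonical = json.dumps(action_dataset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting keys before hashing is one way to keep the signature stable across devices that may serialize the same action dataset in a different order.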

Looking now to FIG. 3, a block diagram 300 of an exemplary digital assistant server 310 suitable for use in implementing embodiments of the invention is shown. Generally, digital assistant server 310 (also depicted as server 130 of FIG. 1) is suitable for establishing connections with digital assistant device(s) 210, receiving generated action datasets, maintaining or indexing received action datasets, receiving commands to determine one or more most relevant action datasets for selection and communication to a sending digital assistant device of the command, and distributing action datasets to other digital assistant devices, such as digital assistant device 210. Digital assistant server 310 can include, among other things, an on-boarding component 320, an indexing component 330, a maintenance component 340, a relevance component 350, and a distribution component 360.

The on-boarding component 320 can receive action datasets generated by one or more digital assistant devices 210 in communication therewith. In some aspects, the on-boarding component can generate an action signature for a received action dataset, similar to how a digital assistant device may, as described herein above. Before storing the received action dataset, the action signature can be searched utilizing the indexing component 330, which maintains an index of all action datasets stored by the digital assistant server 310. The indexing component 330 facilitates quick determination of uniqueness of received action datasets, and reduces redundancy and processing load of the digital assistant server 310.

On a similar note, the maintenance component 340 can determine whether any portion of a received action dataset is different from action datasets already stored on or by the server (e.g., in a database), and extract such portions for merging into the existing corresponding action datasets. Such portions may be identified in circumstances where only command templates are hashed in the action signature, or where each portion of the action dataset (e.g., application identifying information, command template(s), event sequence) is independently hashed either by training component 250 of FIG. 2 or on-boarding component 320 of FIG. 3, to more easily identify changes or differences between action datasets. By way of example, in some embodiments, a received action dataset can include separate hashes for its application identifying information, event sequence, and command template(s). In this regard, the digital assistant server 310 can employ the indexing component 330 and maintenance component 340 to quickly identify, for instance, that the received action dataset corresponds to a particular application and operation, but that the command template(s) are different from those stored in the stored action dataset by virtue of the command template hashes being different. Similarly, the independent hash signatures for each portion of data included in an action dataset can facilitate efficient determination of changes or differences between any combination of data portions in a received action dataset and a stored action dataset.
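
The per-portion comparison described above can be illustrated, without limitation, by the following sketch. The portion names, the dictionary layout, and the use of SHA-256 are assumptions for illustration only and are not drawn from the disclosure.

```python
import hashlib
import json

# Hypothetical names for the three independently hashed portions.
PORTIONS = ("application_info", "event_sequence", "command_templates")


def portion_hashes(action_dataset: dict) -> dict:
    """Compute one hash per portion of an action dataset."""
    hashes = {}
    for portion in PORTIONS:
        canonical = json.dumps(action_dataset[portion], sort_keys=True)
        hashes[portion] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return hashes


def changed_portions(received: dict, stored: dict) -> list:
    """List the portions whose hashes differ between two action datasets."""
    received_hashes = portion_hashes(received)
    stored_hashes = portion_hashes(stored)
    return [p for p in PORTIONS if received_hashes[p] != stored_hashes[p]]
```

Under these assumptions, a result containing only the command template portion would indicate that the received action dataset targets the same application and operation as the stored one, and that only its command templates are candidates for merging.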

Relevance component 350 can determine, based on commands received by a digital assistant device 210, a likelihood that a particular command template corresponds to the received command. While a variety of relevance determining methods may be employed, a machine learning implementation may be preferable, though a ranking of determined most similar command templates to a command received from a digital assistant device 210 can also facilitate a determination of relevance and therefore one or more most relevant command templates. Determined most-relevant command templates can thereby facilitate the selection of a most relevant action dataset to be distributed to the command-sending digital assistant device 210.

The distribution component 360 can distribute or communicate to one or more digital assistant devices 210, determined relevant or most relevant action datasets, determined new action datasets, determined updated action datasets, any portion and/or combination of the foregoing, or generated notifications corresponding to any portion and/or combination of the foregoing, among other things, based on a variety of factors. For instance, the distribution component 360 can include features that determine, among other things, which applications are installed on a digital assistant device 210. Such features can enable the digital assistant server 310 to determine which action datasets or portions thereof are relevant to the digital assistant device 210, and should be distributed to the digital assistant device 210. For instance, a digital assistant device 210 profile (not shown) describing all applications currently installed or executable by a digital assistant device 210, can be maintained (e.g., stored, updated) by the digital assistant server 310. The profile can be updated periodically, manually, or dynamically by a server interfacing component 260 of the digital assistant device 210 (e.g., whenever the digital assistant is in communication with and sends a command to the digital assistant server 310, or whenever an application is installed or updated on the digital assistant device 210). The distribution component 360 can distribute or communicate notifications, action datasets, or portions thereof, in a variety of ways, such as pushing, sending in response to received requests for updates, sending in response to established communications with a digital assistant device 210, or by automatic wide scale (e.g., all digital assistant devices) or selective scale (e.g., region, location, app type, app name, app version) distribution, among other things.
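
One possible, non-limiting way to use such a device profile when selecting action datasets for distribution is sketched below. The profile layout (a mapping of application name to installed version), the exact-version matching rule, and the function name are all hypothetical; an actual embodiment could apply any compatibility rule.

```python
def datasets_for_device(profile: dict, action_datasets: list) -> list:
    """Select the action datasets whose application is installed on a device.

    `profile` maps an application name to its installed version. Exact
    version matching is an illustrative assumption, not a requirement.
    """
    selected = []
    for dataset in action_datasets:
        info = dataset["application_info"]
        if profile.get(info["app_name"]) == info["app_version"]:
            selected.append(dataset)
    return selected
```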

Turning now to FIG. 4, a data structure 400 of an exemplary action dataset 410 in accordance with some of the described embodiments is illustrated. The depicted data structure is not intended to be limiting in any way, and any configuration of the depicted data portions of information may be within the purview of the present disclosure. Further, additional data portions or fewer data portions may be included in an action dataset 410 while remaining within the purview of the present disclosure.

In the depicted data structure 400, the action dataset 410 includes application identifying information 420, recorded event sequence data 430, and command templates 440. In some embodiments, the action dataset 410 further includes hash(es) 450, which can include a hash value generated based on the entire action dataset 410, or hash values generated based on any portion of the aforementioned data portions 420, 430, 440, among other things. The action dataset 410 can be generated by training component 250 of digital assistant device 210 of FIG. 2 and/or received from distribution component 360 of digital assistant server 310 of FIG. 3.
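
For illustration only, the data portions 420, 430, 440, and 450 could be modeled as shown below. The class and field names are hypothetical, and the fields of the application information follow the exemplary identifying information described in this disclosure; no particular serialization or storage format is implied.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ApplicationInfo:
    """Illustrative counterpart of application identifying information 420."""
    os_name: str
    os_version: str
    os_language: str
    app_name: str
    app_version: str


@dataclass
class ActionDataset:
    """Illustrative counterpart of action dataset 410."""
    application_info: ApplicationInfo                       # portion 420
    event_sequence: list = field(default_factory=list)      # portion 430
    command_templates: list = field(default_factory=list)   # portion 440
    hashes: dict = field(default_factory=dict)              # optional portion 450


example = ActionDataset(
    application_info=ApplicationInfo("ExampleOS", "12", "en-US", "MusicApp", "5.0"),
    event_sequence=[{"type": "tap", "target": "search_field"}],
    command_templates=["play {song}"],
)
```

Calling `asdict(example)` yields a nested dictionary that could be serialized for communication between a digital assistant device and a server.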

The application identifying information 420 can include information about a particular application that is required for execution to perform a particular operation for which the action dataset 410 was created. Exemplary pieces of application identifying information 420 are depicted in identifying information 425, which can include any one or more of a name of an operating system (OS) on which the particular application is executed, an OS version of the aforementioned OS, a defined native language of the aforementioned OS, a name of the particular application, a version of the particular application, and the like. It is contemplated that the application identifying information 420 is required and checked (e.g., by the digital assistant server 310 of FIG. 3), before an action dataset 410 is distributed to a digital assistant device (e.g., digital assistant device 210 of FIG. 2) and employed by the digital assistant device to ensure that the action dataset 410 is compatible with, or can be correctly interpreted by action executing component 240 of FIG. 2, so that the corresponding and desired operation is performed by the digital assistant device 210.

The recorded event sequence data 430 can include any or all task or event-related data that was obtained, received, or determined by the digital assistant device (e.g., via training component 250 of FIG. 2) responsible for generating the action dataset 410. As noted herein, the recorded event sequence data can include timing attributes of received inputs (e.g., delays before or in between successive inputs, duration of inputs, GUI elements interacted with, relative positions of GUI elements, labels or metadata of GUI elements, scroll inputs and distances, links or URLs accessed or activated, detected activation of application deep links in response to received inputs, and more). In some instances, the recorded event sequence data 430 may include conditions that require actual user intervention before subsequent events or tasks are resumed. For instance, secured login screens may require that a user input username and password information before an application is executed. In this regard, the recorded event sequence data 430 may include a condition corresponding to when user authentication has occurred, and instructions (e.g., interpretable by action executing component 240) to proceed with the tasks or events in the recorded event sequence data 430 based upon an occurrence of the condition. In various implementations, it is contemplated that the action executing component 240 of FIG. 2 can parse metadata, GUI elements, or other information from an executing application to determine when certain events occur or conditions are met. In this regard, additional conditions may be included in the recorded event sequence data 430 that require prior events or tasks to be completed, or certain GUI elements or conditions to be met, before subsequent events or tasks are performed by the action executing component 240 of FIG. 2.
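
A minimal sketch of how an event sequence containing delays and conditions might be interpreted is provided below, by way of example and not limitation. The event dictionary layout, the event type names, the polling approach, and the callback signatures are assumptions for illustration; the disclosure does not prescribe an event encoding.

```python
import time
from typing import Callable


def replay(events: list,
           perform: Callable[[dict], None],
           condition_met: Callable[[str], bool],
           poll_interval: float = 0.1,
           timeout: float = 5.0) -> None:
    """Reproduce recorded events in order, pausing on delays and conditions.

    `perform` reproduces a single input event on the device, and
    `condition_met` reports whether a named condition (for example, user
    authentication) currently holds. Both callbacks are hypothetical.
    """
    for event in events:
        if event["type"] == "delay":
            time.sleep(event["seconds"])
        elif event["type"] == "wait_for":
            # Block until the condition occurs, e.g., the user has logged in.
            deadline = time.monotonic() + timeout
            while not condition_met(event["condition"]):
                if time.monotonic() >= deadline:
                    raise TimeoutError("condition not met: " + event["condition"])
                time.sleep(poll_interval)
        else:
            perform(event)
```

In this sketch, events following a condition are not reproduced until the condition is reported as met, which corresponds to resuming the recorded sequence only after user intervention has occurred.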

Turning now to FIG. 5, an embodiment of a process flow or method 500 for generating actionable operations for a crowd-sourced digital assistant network is described. At step 510, a training mode of a digital assistant device is initialized. The training mode is a mode of the digital assistant application executing on the digital assistant device, the digital assistant application being either an independently running application or an application embedded in an operating system of the digital assistant device. The training mode is intended to initialize a process for generating an action dataset that can be interpreted by any digital assistant device on the crowd-sourced digital assistant network to reproduce a desired operation facilitated by a recorded sequence of events or tasks.

At step 520, a set of input events, application events or application tasks corresponding to an installed application of the digital assistant device is recorded, in sequence, as the events and tasks occur. Various aspects of the events and tasks can be recorded, such as duration, delays, conditions, and more as described herein above.

At step 530, a set of commands to generate a set of command templates can be received by the digital assistant device. A command can be received as speech data via a microphone of the digital assistant device. The received speech data can then be converted to text data by employing a speech-to-text engine installed on or accessible by the digital assistant device. The converted text data corresponds to the command representation in accordance with embodiments described herein. In some embodiments, the command representation may include parameters that correspond to variables that may be included or spoken in a received command. The portion of the command representation corresponding to a parameter can be defined by a user as having a defined parameter type in accordance with embodiments described herein. The parameter value of the received command representation can be extracted, and replaced with a reference to the defined parameter type. In this way, the resulting command representation, with one or more parameter types in place of any parameter values selected by a user, can be stored as a command template.

At step 540, an action dataset can be generated based on the recorded events and tasks, the received set of commands or generated command templates derived therefrom, and information that identifies the application on which the events and tasks were performed. This application identifying information can be determined by the digital assistant device based on extracted metadata or an application list maintained by the digital assistant device.

Turning now to FIG. 6, an embodiment of a process flow or method 600 for on-boarding action datasets onto a server for a crowd-sourced digital assistant network is described. At step 610, a server can receive an action dataset generated by a digital assistant device. At step 620, the server can determine that at least a portion of the received action dataset is different than a plurality of stored action dataset portions maintained and/or indexed by the server. In some aspects, the server can index each received action dataset or portion based on generated hashes that each correspond to the received action dataset or portion. In this regard, the index can be employed to quickly identify whether received action datasets or portions thereof are new, the same, or have been changed or updated in comparison to action datasets stored and/or maintained by the server.

At step 630, the server can store at least the received portion of the action dataset based on the determination that it is different than the ones stored on the server. In some instances, if only a portion of an action dataset is determined different, then the different portion can be merged into the action dataset stored and/or maintained by the server. Portions can correspond to command templates, events, tasks, and more.

At step 640, the server can distribute at least the determined different portion to one or more digital assistant devices in communication with the server. In some instances, the server can determine which digital assistant devices have an application that corresponds to the updated or changed action dataset, and push the updated or changed action dataset to such digital assistant devices. In some other instances, the digital assistant devices can request reception of updated, new, or changed action datasets upon request or connection to the server, among other ways.

Turning now to FIG. 7, an embodiment of a process flow or method 700 for reproducing a desired operation corresponding to an action dataset sourced by a crowd-sourced digital assistant network is described. At step 710, a digital assistant device can receive a command representation. The command representation can be generated based on received speech data, or based on other data communicated from another computing device. The digital assistant device can determine that the received command representation corresponds to an action dataset stored thereon, or stored on a server of the crowd-sourced digital assistant network. The determination can be made utilizing a variety of operations described herein. Based on the determination that an action dataset corresponds to the received command representation, the digital assistant device can retrieve or obtain the corresponding action dataset, from memory or from the server. The digital assistant device can, at step 730, interpret the corresponding action dataset to reproduce an operation intended to be performed based on the recorded set of events or tasks stored in the action dataset.

Having described various embodiments of the invention, an exemplary computing environment suitable for implementing embodiments of the invention is now described. With reference to FIG. 8, an exemplary computing device is provided and referred to generally as computing device 800. The computing device 800 is but one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computing device 800 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated.

Embodiments of the invention may be described in the general context of computer code or machine-useable instructions, including computer-useable or computer-executable instructions, such as program modules, being executed by a computer or other machine, such as a personal data assistant, a smartphone, a tablet PC, or other handheld device. Generally, program modules, including routines, programs, objects, components, data structures, and the like, refer to code that performs particular tasks or implements particular abstract data types. Embodiments of the invention may be practiced in a variety of system configurations, including handheld devices, consumer electronics, general-purpose computers, more specialty computing devices, etc. Embodiments of the invention may also be practiced in distributed computing environments where tasks are performed by remote-processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

With reference to FIG. 8, computing device 800 includes a bus 810 that directly or indirectly couples the following devices: memory 812, one or more processors 814, one or more presentation components 816, one or more input/output (I/O) ports 818, one or more I/O components 820, and an illustrative power supply 822. Bus 810 represents what may be one or more busses (such as an address bus, data bus, or combination thereof). Although the various blocks of FIG. 8 are shown with lines for the sake of clarity, in reality, these blocks represent logical, not necessarily actual, components. For example, one may consider a presentation component such as a display device to be an I/O component. Also, processors have memory. The inventors hereof recognize that such is the nature of the art and reiterate that the diagram of FIG. 8 is merely illustrative of an exemplary computing device that can be used in connection with one or more embodiments of the present invention. Distinction is not made between such categories as “workstation,” “server,” “laptop,” “handheld device,” etc., as all are contemplated within the scope of FIG. 8 and with reference to “computing device.”

Computing device 800 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by computing device 800 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVDs) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 800. Computer storage media does not comprise signals per se. Communication media typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media, such as a wired network or direct-wired connection, and wireless media, such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.

Memory 812 includes computer storage media in the form of volatile and/or nonvolatile memory. The memory may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard drives, optical-disc drives, etc. Computing device 800 includes one or more processors 814 that read data from various entities such as memory 812 or I/O components 820. Presentation component(s) 816 presents data indications to a user or other device. Exemplary presentation components include a display device, speaker, printing component, vibrating component, and the like.

The I/O ports 818 allow computing device 800 to be logically coupled to other devices, including I/O components 820, some of which may be built in. Illustrative components include a microphone, joystick, game pad, satellite dish, scanner, printer, wireless device, etc. The I/O components 820 may provide a natural user interface (NUI) that processes air gestures, voice, or other physiological inputs generated by a user. In some instances, inputs may be transmitted to an appropriate network element for further processing. An NUI may implement any combination of speech recognition, touch and stylus recognition, facial recognition, biometric recognition, gesture recognition both on screen and adjacent to the screen, air gestures, head and eye tracking, and touch recognition associated with displays on the computing device 800. The computing device 800 may be equipped with depth cameras, such as stereoscopic camera systems, infrared camera systems, RGB camera systems, and combinations of these, for gesture detection and recognition. Additionally, the computing device 800 may be equipped with accelerometers or gyroscopes that enable detection of motion. The output of the accelerometers or gyroscopes may be provided to the display of the computing device 800 to render immersive augmented reality or virtual reality.

Some embodiments of computing device 800 may include one or more radio(s) 824 (or similar wireless communication components). The radio 824 transmits and receives radio or wireless communications. The computing device 800 may be a wireless terminal adapted to receive communications and media over various wireless networks. Computing device 800 may communicate via wireless protocols, such as code division multiple access (“CDMA”), global system for mobiles (“GSM”), or time division multiple access (“TDMA”), as well as others, to communicate with other devices. The radio communications may be a short-range connection, a long-range connection, or a combination of both a short-range and a long-range wireless telecommunications connection. When we refer to “short” and “long” types of connections, we do not mean to refer to the spatial relation between two devices. Instead, we are generally referring to short range and long range as different categories, or types, of connections (i.e., a primary connection and a secondary connection). A short-range connection may include, by way of example and not limitation, a Wi-Fi® connection to a device (e.g., mobile hotspot) that provides access to a wireless communications network, such as a WLAN connection using the 802.11 protocol; a Bluetooth connection to another computing device, which is a second example of a short-range connection; or a near-field communication connection. A long-range connection may include a connection using, by way of example and not limitation, one or more of CDMA, GPRS, GSM, TDMA, and 802.16 protocols.

Many different arrangements of the various components depicted, as well as components not shown, are possible without departing from the scope of the claims below. Embodiments of the present invention have been described with the intent to be illustrative rather than restrictive. Alternative embodiments will become apparent to readers of this disclosure after and because of reading it. Alternative means of implementing the aforementioned can be completed without departing from the scope of the claims below. Certain features and sub-combinations are of utility and may be employed without reference to other features and sub-combinations and are contemplated within the scope of the claims.

What is claimed is:
 1. A computer-implemented method performed by a digital assistant device, the method comprising: generating an action dataset for an application of the digital assistant device, by: recording, at the digital assistant device, a set of received inputs and application events that correspond to an application installed on the digital assistant device; receiving, by the digital assistant device, a set of commands to generate a set of command templates associated with the application installed on the digital assistant device; communicating, by the digital assistant device, the generated action dataset to a remote server for distribution to other digital assistant devices.
 2. The computer-implemented method of claim 1, wherein the generated action dataset causes the application to perform an operation in response to the digital assistant device receiving a command that matches the received set of commands of the generated action dataset.
 3. The computer-implemented method of claim 1, wherein the set of received inputs includes at least one touch input event.
 4. The computer-implemented method of claim 1, wherein the application events include a detected GUI element, detected metadata, a detected state, or a detected operation for the application.
 5. The computer-implemented method of claim 1, wherein the set of received inputs and application events are recorded in sequence.
 6. The computer-implemented method of claim wherein the set of commands includes a set of command representations that include a corresponding string of text.
 7. The computer-implemented method of claim 6, wherein the corresponding string of text is generated based on converted speech data received by the digital assistant device.
 8. The computer-implemented method of claim 1, wherein the generated action dataset is associated with causing the digital assistant device to perform a particular operation or invoke a particular output of the application.
 9. The computer-implemented method of claim 1, wherein the action dataset includes identifying information for the application.
 10. The computer-implemented method of claim 9, wherein the application identifying information includes a name of the application and a version of the application.
 11. The computer-implemented method of claim 9, wherein the application identifying information includes a name of an operating system installed on the digital assistant device.
 12. A computer-implemented method, comprising: receiving, at a server device, an action dataset generated by a digital assistant device and associated with an application of the digital assistant device; determining, at the server device, that at least a portion of the received action dataset is different than other action dataset portions stored in memory of the server device; storing, by the server device, the portion of the received action dataset to the memory of the server device; identifying other digital assistant devices that have the application of the digital assistant device; and distributing, by the server device, the portion of the received action dataset to the identified other digital assistant devices.
 13. The computer-implemented method of claim 12, wherein the action dataset includes application identifying information for the application.
 14. The computer-implemented method of claim 12, wherein at least the portion of the received action dataset is determined to be different than other action dataset portions based on a hash generated for the portion of the received action dataset not matching hashes generated for the other action dataset portions stored in the memory.
 15. A non-transitory, computer readable medium whose contents, when executed by a computing system of a digital assistant device, cause the computing system to perform a method, the method comprising: receiving a command representation; determining that the command representation corresponds to a command template stored in an action dataset, wherein the action dataset is indexed by the digital assistant device or a server of a crowd-sourced digital assistant network associated with multiple other digital assistant devices; receiving the action dataset that stores the command template corresponding to the received command representation; and reproducing, via an application of the digital assistant device, a set of recorded events or tasks stored by the received action dataset by interpreting the received action dataset.
 16. The non-transitory, computer readable medium of claim 15, wherein the command representation is generated by a speech-to-text engine of the digital assistant device based on received speech data.
 17. The non-transitory, computer readable medium of claim 15, wherein the action dataset is indexed based on a hash generated for the action dataset.
 18. The non-transitory, computer readable medium of claim 15, wherein the action dataset is one of multiple indexed action datasets stored on the digital assistant device or the server.
 19. The non-transitory, computer readable medium of claim 15, wherein the set of recorded events or tasks corresponds to the application installed on the digital assistant device.
 20. The non-transitory, computer readable medium of claim 15, wherein the action dataset is received based on a determination that the digital assistant device has installed the application.
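
By way of illustration and not limitation, the hash-based difference determination and application-based distribution recited in claims 12 and 14, together with the command template matching recited in claim 15, might be sketched as follows. This is a minimal Python sketch under stated assumptions and not the claimed implementation: the names `portion_hash`, `ActionDatasetServer`, and `matches_template`, the dictionary layout of a dataset portion, the use of SHA-256 over canonical JSON, and the `{placeholder}` template syntax are all hypothetical choices made for the example.

```python
import hashlib
import json
import re


def portion_hash(portion: dict) -> str:
    """Hash a dataset portion via canonical JSON (an assumed serialization)."""
    canonical = json.dumps(portion, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ActionDatasetServer:
    """Hypothetical in-memory stand-in for the server device of claim 12."""

    def __init__(self):
        self.stored = {}          # hash -> stored dataset portion
        self.installed_apps = {}  # device id -> set of application names
        self.outbox = {}          # device id -> portions distributed to it

    def register_device(self, device_id, apps):
        self.installed_apps[device_id] = set(apps)
        self.outbox.setdefault(device_id, [])

    def receive(self, sender_id, portion):
        """Store and distribute the portion only if its hash is new."""
        digest = portion_hash(portion)
        if digest in self.stored:
            return []  # hash matches a stored portion, so nothing is done
        self.stored[digest] = portion
        # Identify other devices that have the same application installed.
        targets = sorted(
            dev for dev, apps in self.installed_apps.items()
            if dev != sender_id and portion["application"] in apps
        )
        for dev in targets:
            self.outbox[dev].append(portion)
        return targets


def matches_template(template: str, command: str) -> bool:
    """Match a command string against a template with {placeholder} slots."""
    literal_parts = re.split(r"\{\w+\}", template)
    pattern = "(.+)".join(re.escape(part) for part in literal_parts)
    return re.fullmatch(pattern, command, flags=re.IGNORECASE) is not None
```

In this sketch, a portion that hashes to an already stored value is neither stored again nor redistributed, and a new portion is queued only for registered devices, other than the sender, that report the same application. A real system would presumably also account for the application version and operating system information mentioned in claims 10 and 11.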